(54) A method and a device for recognising speech

(57) According to the invention, speech recognition
can be limited to a smaller number making use of commands
(16, 17) given by a user. By means of a method
and device, according to the invention, it is also possible
to activate a speech recognition device by making use,
in the activation, of an existing keyboard or a touch-sensitive
screen of the device. A method, according to the
invention, provides the user with a logical way to activate
the speech recognition device in the device simultaneously
providing an improved capacity of the speech recognition
device. According to the invention, it is also possible
to carry out the speech recognition process by
making use of commands given by the user irrespective
of how the speech recognition device itself is activated.
According to an embodiment of the invention, the device
comprises a touch-sensitive screen or surface, in which
case the information about a single letter or several letters
written on the screen is transmitted to the speech
recognition device, whereupon speech recognition is
limited to words, wherein the letters in question occur.
Speech recognition is most preferably limited to a name
beginning with the letter written on the touch screen by
the user.
Description

[0001] The present invention relates to a method for 
recognising speech and a device that utilises the speech 
recognition method according to the invention. 
[0002] Normally, in mobile telephones, it is possible
to browse through a telephone notepad to select a name
by making use of the first letter of the name searched
for. In this case, when a user during the search presses,
e.g. the letter "s", the names beginning with the letter
"s" are retrieved from a memory. Thus, the user can
more quickly find the name he/she is looking for without
needing to browse through the content of the notepad
in alphabetical order in order to find the name. This kind
of method is fully manual and is based on the commands
given by the user through a keyboard and the browsing
of a memory based on this.

[0003] Today, there are also some mobile stations that
utilise speech recognition devices, wherein a user can
give a command by voice. In these devices, the speech
recognition device is often speaker-dependent; i.e. the
operation of the speech recognition device is based on
that the user teaches the speech recognition device
words that the speech recognition device is supposed
to later recognise. There are also so-called
speaker-independent speech recognition devices for which no
separate training phase is required. In this case, the
operation of the speech recognition device is based on a
large amount of teaching material compiled from a large
sampling of different types of speakers. Moderate
operation in case of a so-called average user is typical of a
speaker-independent recognition device. Correspondingly,
a speaker-dependent speech recognition device
operates best for the person who has trained the speech
recognition device.

[0004] It is typical of both speech recognition devices
mentioned above that the performance of the speech
recognition device greatly depends on how large a
vocabulary is used. It is also typical of speech recognition
devices according to prior art that they are limited to a
specific number of words, which the speech recognition
device is capable of recognising. For example, in mobile
stations, a user is provided with a maximum of 20
names, which he/she can store in a notepad within the
telephone by voice and, correspondingly, use these
stored names in connection with voice selection. It is
quite obvious that such a number is not sufficient in
present or future applications, where the objective is to
substantially increase the number of words to be
recognised. As the number of words to be recognised increases,
e.g. ten-fold, with current methods, it is not possible
to maintain the same speech recognition capacity as
when using a smaller vocabulary. Another limiting factor,
e.g. in terminal equipment, is the need for a memory to
be used, which naturally increases as the vocabulary of
the speech recognition device expands.
[0005] In current speech recognition devices according
to prior art, the activation of a speech recognition
device can be implemented by voice using a specific
activation command, such as e.g. "ACTIVATE", whereupon
the speech recognition device is activated and is
ready to receive commands from a user. A speech
recognition device can also be activated with a separate
key. It is typical of speech recognition devices activated
by voice that the performance of the activation is
dependent on the noise level of the surroundings. Also during
the operation of the speech recognition device, the
noise level of the surroundings greatly affects the
performance of the speech recognition device to be
achieved. It can be said that critical parameters for the
performance of a speech recognition device are the
extent of the vocabulary and the noise conditions of the
surroundings.

[0006] A further known speech recognition system is
disclosed in US 4,866,776 where a user can select a
sub-vocabulary of words by selecting an initial string of
one or more letters causing the recognition to be
performed against the sub-vocabulary restricted to words
starting with those initial letters.
[0007] Now, we have invented a method and a device
for recognising speech the objective of which is to avoid
or, at least, to mitigate the above-mentioned problems
of prior art. The present invention relates to a device and
a method, wherein a user is allowed to give, during
speech recognition, a qualifier by means of which
speech recognition is only limited to those speech
models that correspond with the qualifier provided by the
user. In this case, only a specific sub-set to be used during
speech recognition is selected from the prestored
speech models.

[0008] According to an embodiment of the invention,
a speech recognition device is activated at the same
time as a qualifier that limits speech recognition is
provided by touching the device making use of the existing
keyboard or touch-sensitive screen/base of the device.
The activation is most preferably implemented with a
key. A method according to the invention provides a user
with a logical way to activate the speech recognition
device therein at the same time providing an improved
performance of the speech recognition device along with
the entered qualifier. The limitation of speech recognition
according to the invention can also be implemented
apart from the activation of the speech recognition
device.

[0009] According to an exemplary embodiment of the
invention, the device comprises a touch-sensitive
screen or surface (base), whereupon the information
about the character or several characters written on the
screen is transmitted to the speech recognition device,
in which case speech recognition is limited to words
wherein the characters in question occur. Speech
recognition is most preferably limited to a name beginning
with the character written by the user on the touch
screen.

[0010] According to an exemplary embodiment of the
invention, speech recognition can also be implemented
by making use in advance of all the stored models and
by utilising the limiting qualifier provided by the user
when defining the final recognition result.
[0011] According to a first aspect of the invention
there is provided a method for recognising an utterance
of a user with a device, wherein a set of models of the
utterances have been stored in advance and for speech
recognition, the utterance of the user is received, the
utterance of the user is compared with the prestored
models and, on the basis of the comparison, a recognition
decision is made, the method being characterised
in that,

- the user is allowed to provide a qualifier limiting the
  comparison by touching the device, the qualifier
  identifying an item in a menu structure of the device,

- a sub-set of models is selected from the stored
  models on the basis of the qualifier provided by the
  user, said sub-set of models identifying sub-items of
  the menu structure, and

- a comparison is made for making the recognition
  decision by comparing the utterance of the user with
  said sub-set of models.

[0012] According to a second aspect of the invention
there is provided a method for recognising an utterance
of a user with a device, wherein a set of models of the
utterances have been stored in advance and for speech
recognition, the utterance of the user is received, the
utterance of the user is compared with the prestored
models and, on the basis of the comparison, a recognition
decision is made, the method being characterised
in that,

- a comparison is made for making a first recognition
  decision by comparing the utterance of the user with
  the prestored models,

- the user is allowed to provide a qualifier limiting the
  comparison by touching the device for selecting a
  sub-set of models, the qualifier identifying an item in
  a menu structure of the device and said sub-set of
  models identifies sub-items of the menu structure,

- a final comparison is made for making the recognition
  decision by comparing the first recognition
  decision with said sub-set of models.

[0013] According to a third aspect of the invention
there is provided a device comprising a speech recognition
device for recognising the utterance of a user,
memory means for storing speech models, and means
for receiving the utterance of the user, comparison
means for carrying out the recognition process by
comparing the utterance of the user with the models stored
in the memory means, the device being characterised
in that the device also comprises means for receiving a
qualifier from the user by touching the device, means
for selecting a set from the stored models on the basis
of the qualifier received from the user for limiting the
comparison made by the comparison means to said set
of models and means for storing a menu structure of a
device and for identifying the received qualifier as an
item in a menu structure of the device.

Figure 1 shows the structure of a speech recognition
         device, according to prior art, as a block
         diagram,

Figure 2 shows the structure of a speech recognition
         device, according to the invention, as a
         block diagram,

Figure 3 shows the operation of a method, according
         to the invention, as a flowchart,

Figure 4 shows the operation of another method,
         according to the invention, as a flowchart, and

Figure 5 shows the structure of a mobile station
         utilising a method according to the invention.

[0014] Figure 1 shows the block diagram structure of
a known speech recognition device as applicable to the
present invention. Typically, the operation of the speech
recognition device is divided into two different main
activities: an actual speech recognition phase 10-12,
14-15 and a speech training phase 10-13, as shown in
Figure 1. The speech recognition device receives from
a microphone as its input a speech signal S(n), which is
converted into a digital form by an A/D converter 10
using, e.g. a sampling frequency of 8 kHz and a 12-bit
resolution per sample. Typically, the speech recognition
device comprises a so-called front-end 11, wherein the
speech signal is analysed and a feature vector 12 is
modelled, the feature vector describing the speech signal
during a specific period. The feature vector is defined,
e.g. at 10 ms intervals. The feature vector can be
modelled using several different techniques. For example,
different kinds of techniques for modelling a feature
vector have been presented in the reference: J. Picone,
"Signal modeling techniques in speech recognition",
IEEE Proceedings, Vol. 81, No. 9, pp. 1215-1247,
September 1993. During the training phase, models are
constructed by means of the feature vector 12, in a training
block 13 of the speech recognition device, for the
words used by the speech recognition device. In model
training 13a, a model is defined for the word to be
recognised. In the training phase, a repetition of the word
to be modelled can be utilised. The models are stored
in a memory 13b. During speech recognition, the feature
vector 12 is transmitted to an actual recognition device
14, which compares, in a block 15a, the models
constructed during the training phase with the feature
vectors to be constructed of the speech to be recognised,
and the decision on the recognition result is made in a
block 15b. The recognition result 15 denotes the word,
stored in the memory of the speech recognition device,
that best corresponds with the word uttered by a person
using the speech recognition device.
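
As a rough illustration of the figures quoted above, an 8 kHz sampling frequency with one feature vector per 10 ms corresponds to 80 samples per analysis frame. The Python sketch below only shows that arithmetic and a simple framing step. The function names are hypothetical, non-overlapping frames are an assumption, and the mean-energy value is a toy stand-in for a feature, not the front-end 11 described here.

```python
SAMPLE_RATE_HZ = 8000    # sampling frequency given as an example in the text
FRAME_INTERVAL_MS = 10   # feature vector interval given as an example in the text


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, interval_ms=FRAME_INTERVAL_MS):
    """Number of samples covered by one feature-vector interval."""
    return sample_rate_hz * interval_ms // 1000


def split_into_frames(samples, frame_len):
    """Cut the signal into non-overlapping frames (assumption); a trailing
    partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def frame_energy(frame):
    """Toy one-dimensional 'feature': mean squared amplitude of a frame."""
    return sum(s * s for s in frame) / len(frame)
```

A real front-end would typically produce a multi-dimensional vector per frame; the sketch is only meant to make the 10 ms framing concrete.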
[0015] Figure 2 shows the operation of a speech
recognition device according to the invention where, in
addition to the solution according to Figure 1, the speech
recognition device comprises a block 16, wherein the
selection of the models is carried out on the basis of the
commands given by a user, e.g. through a keyboard.
The block 16 receives as its input a signal 17 containing
the information on which key the user has pressed. In
the block 16, speech models 18, transmitted by the
block 13b, are compared with the signal 17 and a sub-set
19 is selected from these and transmitted to the
block 15a of the speech recognition device. The selection
of the models relating to the operation of the block
16 has been described below making use of a memory
structure according to the present invention.

Table 1

Name          Number        Reference Model
Smith Paul    0405459883    xxx...x


Table 2

Menu
    Phone Settings
    Messages
        Read messages
        Write messages
    Memory Functions


[0016] Table 1 shows a memory structure according
to the invention, which may form, e.g. a phone notepad
of a mobile station or part of it. The memory comprises
the name of a person, a telephone number that corresponds
with the name, as well as a reference model (e.g.
a feature vector) constructed during the speech recognition
training phase. The table shows as an example
one line of the table, on which a person's name "Smith",
a corresponding telephone number "0405459883", as
well as a data field containing a reference model
"xxx...x" are stored. The length of the reference model is a
speech recognition device-specific parameter and,
therefore, the field length depends on the speech recognition
device used. According to the invention, when
a user presses a specific key of the device, e.g. the key
"s", the processor of the device goes through the content
of the memory and compares the content of the data
field containing the name and only retrieves from the
memory the names that begin with the letter "s". The
comparison can be made, e.g. by comparing the ASCII
character of the pressed key with the ASCII character
of the first letter of the name in the memory and by
selecting the reference model that corresponds with the
name provided that the letters correspond with one another
in the comparison. The information on the selected
reference models (sub-set) is then transmitted to the
speech recognition device, after which the speech recognition
device carries out speech recognition using the
models that relate to the names selected above.
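
The selection step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of block 16: the `NotepadEntry` type, its field names and the `select_models` function are hypothetical, the reference model is represented by an opaque placeholder, and the comparison is made case-insensitively, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class NotepadEntry:
    """One line of a Table 1 style memory structure (hypothetical layout)."""
    name: str
    number: str
    reference_model: tuple  # placeholder for the stored reference model


def select_models(entries, typed_letters):
    """Return the entries whose name begins with the letters keyed in so far.

    Only these entries' reference models would be passed on to the
    comparison step of the recogniser.
    """
    prefix = typed_letters.casefold()
    return [e for e in entries if e.name.casefold().startswith(prefix)]
```

With a notepad holding "Smith Paul", "Jones Mary" and "Stone Anna", pressing "s" would keep the first and third entries, and a further "m" would keep only "Smith Paul".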
[0017] The user can further also press another key,
e.g. the key "m", whereupon speech recognition is limited
to the names beginning with the letter combination
"Sm". In this case, the number of names to be recognised
can be further limited, i.e. the sub-set of models
decreases. In addition, it is also possible that the memory
contains fields other than the above-mentioned
name field on the basis of which the speech recognition
device is activated according to the invention. The
telephone memory of a device, such as a mobile station,
may contain, e.g. a field that indicates whether a specific
number is a mobile station number or not. In this case,
the memory field may contain, e.g. an element "GSM",
whereupon when the user activates this field only the
GSM numbers are selected and not the others, e.g. the
numbers of a fixed network or fax numbers. Thus, the
invention is not limited to a case wherein the letter
selected by the user controls the operation of the speech
recognition device but, instead, the user may select
names, e.g. from a telephone notepad according to
some other classification. For example, the names in a
telephone notepad may have been divided into classes
like "Home", "Office", "Friends", in which case the mobile
station may provide a convenient way to select from
the menu, e.g. the class "Friends", whereupon speech
recognition is directed to the names in this class,
according to the invention. It is also possible that the mobile
station comprises a keyboard, wherein several different
characters are combined in a specific key. For example,
the letter symbols "j, k, l" may be included in the
numeric key "5". In this case, the invention can be applied
so that when the user presses the key "5", the
speech recognition device is activated so that, in speech
recognition, it is limited to names beginning with the
letters "j", "k" or "l". In an exemplary embodiment of the
invention, when the user presses the key SEND, speech
recognition, according to the invention, can be limited,
e.g. to the last made calls (e.g. the last 10 calls). In this
case, a call can be commenced, e.g. by pressing and
holding the key SEND, while the user simultaneously
pronounces the name to be recognised, whereupon
speech recognition is limited to a set of models containing
the name/symbol of the last 10 calls.
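
The case of a key that carries several letters can be sketched as below. The mapping contains only the "5" key with "j, k, l" mentioned in the text; the table name `KEY_LETTERS`, the function name and the fallback of treating any other key as its own letter are assumptions made for illustration.

```python
# Only the key "5" is taken from the text; other keys would be added
# according to the keypad layout of the actual device (assumption).
KEY_LETTERS = {"5": "jkl"}


def names_for_key(names, key):
    """Keep the names whose first letter is one of the letters on the key.

    A key without an entry in KEY_LETTERS is treated as a plain letter key.
    """
    letters = KEY_LETTERS.get(key, key).lower()
    return [n for n in names if n and n[0].lower() in letters]
```

The same pattern would apply to the other qualifiers mentioned above (a "GSM" field, a class such as "Friends", or the last 10 calls): each qualifier maps to a predicate that decides which stored models stay in the sub-set.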

[0018] The speech recognition device is most preferably
activated by a press-and-hold, whereupon the device
(speech recognition device) is informed by the
pressing and holding of the key in question that the user
wants speech recognition to commence. At the same time, the
information about the pressed key is transmitted to the
speech recognition device, i.e. speech recognition is
limited, e.g. to words beginning with the letter on the key,
whereupon only the reference models that the user
wants are activated. It is also according to the invention
that the speech recognition device is activated in a way
other than by pressing a key, for example, by voice. In
this case, after the activation of the speech recognition
device, it is possible to use, during speech recognition,
the reference model selection according to the invention
as presented above.

[0019] An arrangement, according to the invention,
can also be made for the menu structure of a mobile
station, as shown in Table 2. Table 2 shows a specific
part of the menu structure of a telephone. In this example,
the main menu consists of the menus "Telephone
Settings", "Messages" and "Memory Functions".
Correspondingly, the menu "Messages" consists of the
submenus "Read messages" and "Write messages". When
a user of the telephone selects a menu function by voice
or pressing a menu key, activation is limited to the points
in the menu. In the example, activation by voice is
directed to the menus "Telephone Settings", "Messages"
or "Memory Functions". The user can further manually
select the submenu "Messages", in which case activation
by voice is directed to the points "Read messages"
or "Write messages" of the menu in question. The
above-mentioned method can also be applied to external
services for a mobile station and their activation. In
this case, a specific key of the mobile station is defined
for a specific service, e.g. for a WWW service (World
Wide Web). In this case, the pressing and holding of the
key in question enables, e.g. the selection of a bookmark
of WWW addresses by using a voice command.
In this application, the mobile station contains a table of
letter symbols, which are selected as described above.
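
The menu-based limitation can be illustrated with a nested dictionary standing in for the menu structure. The sketch below is an assumption about how such a structure might be represented; the names `MENU` and `active_vocabulary` are hypothetical, and the menu labels follow the wording used in this paragraph.

```python
# Menu structure following the example in the text; leaves are empty dicts.
MENU = {
    "Telephone Settings": {},
    "Messages": {"Read messages": {}, "Write messages": {}},
    "Memory Functions": {},
}


def active_vocabulary(menu, selected_path=()):
    """Return the menu items that voice activation is limited to.

    selected_path lists the items the user has already chosen manually;
    recognition is then limited to the sub-items of the last chosen item.
    """
    node = menu
    for item in selected_path:
        node = node[item]
    return list(node)
```

With no manual selection the vocabulary is the three main menus; after the user selects "Messages" it shrinks to "Read messages" and "Write messages".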
[0020] Figure 3 shows the activity sequence of a
method according to the invention. In Phase 30, it is
discovered whether or not a user has carried out the
press-and-hold that activates the speech recognition device.
If no press-and-hold is discovered, the device remains
in a state where the activation of the speech recognition
device is being anticipated. Alternatively, the speech
recognition device can be activated at the same time,
when the user starts writing on a touch-sensitive surface,
such as a screen. The activation of the speech
recognition device may also be based on voice. In Phase
31, the letter/text written on the touch screen is
recognised. In Phase 32, the information about the pressing
of the key is transmitted to the speech recognition device
and/or the information about the alphanumeric
character written or drawn by the user on the touch
screen is transmitted. It is also possible to draw, on the
touch screen, some other figure that deviates from an
alphanumeric character, which is utilised in speech
recognition. In Phase 33, it is examined whether or not the
user is still carrying out the pressing of keys or writing
on the touch screen, in which case the information about
these activities is also transmitted to the speech
recognition device. This can be done by comparing the
activities of the user with a specific time threshold value by
means of which it is decided whether or not the user has
concluded the giving of commands. In Phase 34, the
word pronounced by the user is recognised by making
use of the information provided in Phase 32.
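
The time-threshold check of Phase 33 can be sketched as follows. This is only one plausible reading: the function name, the event format (timestamp in seconds, character) and the rule that a pause longer than the threshold ends the qualifier are all assumptions, since the text does not fix these details.

```python
def collect_qualifier(events, timeout_s):
    """Join the characters entered before the first pause longer than timeout_s.

    events is a time-ordered list of (timestamp_in_seconds, character) pairs
    coming from key presses or recognised handwriting.
    """
    chars = []
    last_time = None
    for timestamp, char in events:
        # A gap above the threshold is taken to mean the user has finished.
        if last_time is not None and timestamp - last_time > timeout_s:
            break
        chars.append(char)
        last_time = timestamp
    return "".join(chars)
```

For example, with a threshold of 1.0 s, the inputs "s" at 0.0 s and "m" at 0.4 s would form the qualifier "sm", while an input arriving at 2.5 s would no longer be counted.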

[0021] Figure 4 shows another activity sequence of a
method according to the invention. In this method, the
pronounced word is first recognised traditionally and,
only after this, the limitation provided by the user is
utilised to limit the result obtained during the recognition
phase. In Figure 4, Phases 30-33 correspond with the
corresponding phases in Figure 3. In Phase 35, the
utterance of the user is recognised by making use of all
the prestored models. The information about this
recognition result is transmitted to Phase 34, wherein the final
recognition decision is made by comparing the first
recognition decision with said sub-set of models, which has
been obtained on the basis of the limitation provided by
the user. The recognition decision obtained from Phase
35 contains a set of proposed words that have been
recognised and the recognition probabilities corresponding
with the words, which are transmitted to Phase 34. In
case of a faulty recognition, the word that has got the
highest recognition probability is not the word
pronounced by the user. In this case, in Phase 34 according
to the invention, it is possible to carry out the final
speech recognition phase by means of the qualifier
provided by the user and reach a higher speech recognition
performance according to the invention. A method
according to the invention may also operate so that the
giving of a limitation and the recognition of a pronounced
word are essentially simultaneous activities.
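
The two-stage decision of Figure 4 can be sketched as a filter over the list of proposed words. The sketch assumes that Phase 35 delivers (word, probability) pairs and that the sub-set is available as a set of words; the function name and the choice to return `None` when no proposal survives are assumptions, not part of the description.

```python
def final_decision(proposed, allowed_words):
    """Pick the most probable proposed word that belongs to the sub-set.

    proposed: list of (word, probability) pairs from the first recognition
              pass over all prestored models.
    allowed_words: set of words selected by the user's qualifier.
    """
    candidates = [(word, prob) for word, prob in proposed if word in allowed_words]
    if not candidates:
        return None  # no proposal matches the qualifier
    return max(candidates, key=lambda pair: pair[1])[0]
```

If the first pass ranks "Jones" (0.48) above "Smith" (0.45) but the user has entered "s", the sub-set excludes "Jones" and the final decision becomes "Smith", which is the faulty-recognition case the paragraph describes.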
[0022] Figure 5 shows the structure of a mobile station,
which has a speech recognition device 66 that utilises
the present invention. The mobile station comprises
parts typical of the device, such as a microphone 61,
a keyboard 62, a screen 63, a speaker 64, and a control
block 65, which controls the operation of the mobile
station. According to an embodiment of the invention, the
screen 63 may be a touch-sensitive surface, such as a
screen. In addition, the figure illustrates transmitter and
receiver blocks 67, 68 typical of a mobile station. The
control block 65 also controls the operation of the
speech recognition device 66 in connection with the mobile
station. When the speech recognition device is being
activated either during the speech recognition device's
training phase or during the actual speech recognition
phase, the voice commands given by the user are
transmitted, controlled by the control block, from the
microphone 61 to the speech recognition device 66.
According to the invention, the control block 65 transmits
to the speech recognition device 66 the information
about the commands given by the user through keys or
about the alphanumeric character/figure entered on to
the touch screen. The voice commands can also be
transmitted through a separate HF (hands free) microphone.
The speech recognition device is typically implemented
by means of DSP and it comprises external and/or
internal ROM/RAM circuits 69 necessary for its operation.
[0023] An embodiment of the present invention may
comprise a device, e.g. a mobile station, which has a
touch-sensitive surface, such as a touch-sensitive
screen or base. In this case, a user writes the first letter
of the word to be recognised on the touch-sensitive surface,
e.g. with a pen or draws with a finger and simultaneously
pronounces the word to be recognised (alternatively,
the user presses the point of the letter displayed
on the screen). In this case, the information
about the provided letter is transmitted to the speech
recognition device and speech recognition is limited to
words in which the letter in question occurs. Recognition
is most preferably limited to words that begin with the
initial in question as described above. In this case, the
user may write, according to the invention, on the
touch-sensitive surface, e.g. the letter "S" and simultaneously
pronounce the name to be recognised, e.g. "Smith",
whereupon speech recognition is limited to names
beginning with the letter "S".
[0024] Alternatively, the user may first write a letter on
the touch screen and, after this, pronounce the word to
be recognised. The above-mentioned method based on
keys and writing on a touch-sensitive surface can also
be combined, in which case the user can both write on
a touch-sensitive surface and press some key and utilise
both of these data in speech recognition. The
touch-sensitive surface in itself is outside this invention, and it
can be implemented in various ways according to prior
art.
[0025] It can be estimated that with a method, according
to the present invention, a recognition accuracy
which is 10-30-fold compared with recognition devices
according to prior art can be achieved if the number of
names to be recognised remains the same. On the other
hand, by means of the invention, it is possible to recognise
10-30 times as many names while the recognition
accuracy remains unchanged. This improved capacity is based
on a combination, according to the invention, whereupon
commands given by the user through keys/a
touch-sensitive surface, i.e. qualifiers limiting speech recognition
search, are combined with speech recognition. One
exemplary embodiment of the invention was based on
the utilisation of a touch screen. An advantage of this
application is that the algorithms used in text recognition
and speech recognition are almost identical, whereupon
the amount of program memory required does not
increase much in a device, where both of these functions
are implemented.

[0026] Above we described a mobile station as an
exemplary embodiment of the present invention. However,
the invention can equally well be applied, for example,
to computers. The present invention is not limited to the
embodiments presented above, and it can be modified
within the framework of the enclosed claims.



Claims

1. A method for recognising an utterance of a user with
a device, wherein a set of models of the utterances
have been stored in advance and for speech recognition,
the utterance of the user is received, the
utterance of the user is compared with the prestored
models and, on the basis of the comparison, a recognition
decision is made, characterised in that,
in the method,

- the user is allowed to provide a qualifier limiting
  the comparison by touching the device, the
  qualifier identifying an item in a menu structure
  of the device,

- a sub-set of models is selected from the stored
  models on the basis of the qualifier provided by
  the user, said sub-set of models identifying
  sub-items of the menu structure, and

- a comparison is made for making the recognition
  decision by comparing the utterance of the
  user with said sub-set of models.

2. A method for recognising an utterance of a user with
a device, wherein a set of models of the utterances
have been stored in advance and for speech recognition,
the utterance of the user is received, the
utterance of the user is compared with the prestored
models and, on the basis of the comparison, a recognition
decision is made, characterised in that,
in the method,

- a comparison is made for making a first recognition
  decision by comparing the utterance of
  the user with the prestored models,

- the user is allowed to provide a qualifier limiting
  the comparison by touching the device for selecting
  a sub-set of models, the qualifier identifying
  an item in a menu structure of the device
  and said sub-set of models identifies sub-items
  of the menu structure,

- a final comparison is made for making the recognition
  decision by comparing the first recognition
  decision with said sub-set of models.

3. A method according to claim 1 or 2, characterised
in that a speech recognition device is activated in
response to the qualifier provided by the user.

4. A method according to claim 1 or 2, characterised
in that the user is allowed to give said qualifier by
pressing a key.

5. A method according to claim 1 or 2, characterised
in that the user is allowed to provide said qualifier
by writing an alphanumeric character on a
touch-sensitive surface of the device.
6. A method according to claim 3 or 4, characterised
in that the user is allowed to provide said qualifier
as a press-and-hold.

7. A device comprising a speech recognition device
(66) for recognising the utterance of a user, memory
means (69) for storing (13b) speech models, and
means (61) for receiving the utterance of the user,
comparison means (19, 15a, 15b) for carrying out
the recognition process by comparing the utterance
of the user with the models stored in the memory
means, characterised in that the device also comprises
means (62, 63) for receiving a qualifier (17)
from the user by touching the device, means (16)
for selecting a set from the stored models on the
basis of the qualifier received from the user for limiting
the comparison made by the comparison
means (19, 15a, 15b) to said set of models and
means (65) for storing a menu structure of a device
and for identifying the received qualifier as an item
in a menu structure of the device.

8. A device according to claim 7, characterised in that
the means for receiving the qualifier from the user
comprise a keyboard.

9. A device according to claim 7, characterised in that
the means for receiving the qualifier comprise a
touch-sensitive surface.

10. A device according to claim 7, characterised in that
it comprises means (62, 63, 65) for activating the
speech recognition device in response to the qualifier
received from the user.
